Rank | Count | Beginning |
---|---|---|
53604 | 9202 | Në |
45480 | 3287 | Më |
1980 | 2554 | Ai |
35311 | 1992 | Kjo |
71488 | 1936 | Për |
69023 | 1453 | Pas |
65102 | 1416 | Një |
63538 | 1177 | Nga |
40962 | 1166 | Ky |
91663 | 1070 | Të |
23070 | 1025 | Gjate |
30736 | 978 | Ka |
77003 | 946 | Por |
18742 | 932 | Është |
4593 | 927 | Ajo |
87386 | 760 | Si |
88310 | 750 | Sipas |
34059 | 691 | Këto |
39881 | 689 | Kur |
16286 | 684 | E |
7554 | 655 | Ata |
16377 | 623 | Edhe |
27623 | 561 | I |
15528 | 490 | Duke |
76141 | 474 | Po |
32994 | 461 | Kështu |
80786 | 437 | Që |
94885 | 417 | U |
14230 | 393 | Disa |
26794 | 375 | Historia |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
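The same count can be sketched outside the database. The query above relies on `substring_index`, which is available in MySQL and MariaDB; the Python version below is only an illustrative equivalent, assuming the sentences are already loaded as a list of strings. The function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical and not part of the corpus tooling.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Mirrors substring_index(sentence, ' ', n): everything before the
    n-th space, or the whole sentence if it has fewer than n spaces.
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by count, descending (like ORDER BY cnt DESC LIMIT 50)
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up sample sentences, for illustration only (not corpus data)
    sample = [
        "Në shtëpi ka dritë.",
        "Në shkollë ka nxënës.",
        "Ai lexon një libër.",
    ]
    print(sentence_beginnings(sample, n=1))  # [('Në', 2), ('Ai', 1)]
```

Setting `n` to 2, 3 or 4 gives the longer beginnings covered in the following subsections.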